
Vertex AI


Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety

Melton, Chad, Sorokine, Alex, Peterson, Steve

arXiv.org Artificial Intelligence

C.A. Melton, A. Sorokine, S. Peterson, Oak Ridge National Laboratory, Oak Ridge, TN, United States (National Security Sciences Directorate). ABSTRACT: Applications of generative Large Language Models (LLMs) are rapidly expanding across various domains, promising significant improvements in workflow efficiency and information retrieval. However, their implementation in specialized, high-stakes domains such as hazardous materials transportation is challenging due to accuracy and reliability concerns. This study evaluates the performance of three fine-tuned generative models, ChatGPT, Google's Vertex AI, and ORNL Retrieval-Augmented Generation (RAG) augmented LLaMA 2 and LLaMA, in retrieving regulatory information essential for hazardous material transportation compliance in the United States. Utilizing approximately 40 publicly available federal and state regulatory documents, we developed 100 realistic queries relevant to route planning and permitting requirements. Responses were qualitatively rated based on accuracy, detail, and relevance, complemented by quantitative assessments of semantic similarity between model outputs. Results demonstrated that the RAG-augmented LLaMA models significantly outperformed Vertex AI and ChatGPT, providing more detailed and generally accurate information, despite occasional inconsistencies. This research introduces the first known application of RAG in transportation safety, emphasizing the need for domain-specific fine-tuning and rigorous evaluation methodologies to ensure reliability and minimize the risk of inaccuracies in high-stakes environments.
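The paper's quantitative comparison rests on semantic similarity between model outputs. As a minimal, self-contained sketch (not the authors' actual pipeline, which would likely use learned sentence embeddings rather than word counts), cosine similarity over bag-of-words vectors illustrates the idea; the two example answers are hypothetical:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts using bag-of-words counts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Compare two hypothetical model answers to the same regulatory query.
ans_rag = "placarding is required for shipments over 1001 lbs of hazardous material"
ans_llm = "shipments over 1001 lbs of hazardous material require placarding"
score = cosine_similarity(ans_rag, ans_llm)
```

A high score flags answers that agree in content even when phrased differently, which is what the study's quantitative assessment is after.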


RAG based Question-Answering for Contextual Response Prediction System

Veturi, Sriram, Vaichal, Saurabh, Jagadheesh, Reshma Lal, Tripto, Nafis Irtiza, Yan, Nian

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks, including their potential as effective question-answering systems. However, to provide precise and relevant information in response to specific customer queries in industry settings, LLMs require access to a comprehensive knowledge base to avoid hallucinations. Retrieval Augmented Generation (RAG) emerges as a promising technique to address this challenge. Yet, developing an accurate question-answering framework for real-world applications using RAG entails several challenges: 1) data availability issues, 2) evaluating the quality of generated content, and 3) the costly nature of human evaluation. In this paper, we introduce an end-to-end framework that employs LLMs with RAG capabilities for industry use cases. Given a customer query, the proposed system retrieves relevant knowledge documents and leverages them, along with previous chat history, to generate response suggestions for customer service agents in the contact centers of a major retail company. Through comprehensive automated and human evaluations, we show that this solution outperforms the current BERT-based algorithms in accuracy and relevance. Our findings suggest that RAG-based LLMs can be an excellent support to human customer service representatives by lightening their workload.
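The retrieve-then-generate loop described above can be sketched in a few lines. The keyword retriever and stub generator below are toy stand-ins for the production knowledge base and LLM, and the knowledge-base entries are invented for illustration:

```python
import re

# Toy knowledge base; in the paper's setting these would be support documents.
KNOWLEDGE_BASE = [
    "Items can be returned within 30 days with a receipt.",
    "Standard shipping takes 5 to 7 business days.",
    "Electronics carry a one-year manufacturer warranty.",
]

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a vector store)."""
    q = tokens(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: len(q & tokens(doc)), reverse=True)
    return ranked[:k]

def generate(query: str, history: list[str], context: list[str]) -> str:
    """Stub generator: a real system would prompt an LLM with this assembled context."""
    prompt = " | ".join(history + context + [query])
    return f"Suggested reply based on: {prompt}"

history = ["Customer: Hi, I bought a laptop last week."]
query = "How long is the warranty?"
suggestion = generate(query, history, retrieve(query))
```

The key design point mirrored here is that retrieved documents and prior chat history are concatenated into the generation context, which is what grounds the suggestion and reduces hallucination.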


Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool Retrieval

Chen, Yanfei, Yoon, Jinsung, Sachan, Devendra Singh, Wang, Qingze, Cohen-Addad, Vincent, Bateni, Mohammadhossein, Lee, Chen-Yu, Pfister, Tomas

arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have enabled autonomous agents with complex reasoning and task-fulfillment capabilities using a wide range of tools. However, effectively identifying the most relevant tools for a given task becomes a key bottleneck as the toolset size grows, hindering reliable tool utilization. To address this, we introduce Re-Invoke, an unsupervised tool retrieval method designed to scale effectively to large toolsets without training. Specifically, we first generate a diverse set of synthetic queries that comprehensively cover different aspects of the query space associated with each tool document during the tool indexing phase. Second, we leverage LLM's query understanding capabilities to extract key tool-related context and underlying intents from user queries during the inference phase. Finally, we employ a novel multi-view similarity ranking strategy based on intents to pinpoint the most relevant tools for each query. Our evaluation demonstrates that Re-Invoke significantly outperforms state-of-the-art alternatives in both single-tool and multi-tool scenarios, all within a fully unsupervised setting. Notably, on the ToolE datasets, we achieve a 20% relative improvement in nDCG@5 for single-tool retrieval and a 39% improvement for multi-tool retrieval.
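The three stages described (synthetic query expansion at index time, intent extraction at inference time, and multi-view similarity ranking) can be sketched with toy lexical similarity standing in for the LLM and embedding model the paper uses; the tool names and synthetic queries below are invented examples:

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

# Index time: each tool document is expanded with synthetic queries.
# In Re-Invoke these are LLM-generated; here they are hand-written stand-ins.
TOOL_INDEX = {
    "weather_api": ["what is the forecast for tomorrow", "will it rain today"],
    "calendar_api": ["schedule a meeting next week", "what is on my agenda"],
}

def extract_intents(user_query: str) -> list[str]:
    # Inference time: an LLM would decompose the query into tool-related
    # intents; this stub crudely splits on "and" as a stand-in.
    return [part.strip() for part in user_query.lower().split(" and ")]

def rank_tools(user_query: str, k: int = 1) -> list[str]:
    # Multi-view ranking: score each tool by its best-matching
    # (intent, synthetic query) pair, then return the top-k tools.
    intents = extract_intents(user_query)
    scores = {
        tool: max(jaccard(tokens(i), tokens(s)) for i in intents for s in synth)
        for tool, synth in TOOL_INDEX.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Taking the best-scoring (intent, synthetic query) pair per tool is what lets a multi-intent query surface several tools at once, which is the multi-tool scenario the paper evaluates.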


Google indemnifies generative AI customers over IP rights claims

InfoWorld News

Google announced on Thursday that it will protect its generative AI customers against any intellectual property claims made on the data used or output served by Google-hosted AI models. By extending protection in its cloud and workspace environments, Google joins the list of technology firms that have recently announced IP support for using their own generative AI tools, including IBM, Microsoft, Amazon, and Adobe. Google said the protection would span all Google environments using the Duet AI collaborator and the company's homegrown generative AI engine, Vertex AI. Indemnity clauses from leading technology companies will likely offer reassurance as generative AI's challenges over privacy, security, and intellectual property violations peak.


Sharing Google's Med-PaLM 2 medical large language model, or LLM

#artificialintelligence

While we'll have some innovations like Med-PaLM 2 that are tuned for healthcare, we also have products that are relevant across industries. Last month, we announced several generative AI capabilities coming to Google Cloud, including Generative AI support in Vertex AI and Generative AI App Builder, which are already being tested by a number of customers. Developers and businesses already use Vertex AI to build and deploy machine learning models and AI applications at scale, and we recently added Generative AI support in Vertex AI. This gives customers foundation models they can fine-tune with their own data, and the ability to deploy applications with this powerful new technology. We also launched Generative AI App Builder to help organizations build their own AI-powered chat interfaces and digital assistants in minutes or hours by connecting conversational AI flows with out-of-the-box search experiences and foundation models.


How Do I Speed Up My TensorFlow Transformer Models? - Liwaiwai

#artificialintelligence

Transformer models have gained much attention in recent years and have been responsible for many of the advances in Natural Language Processing (NLP). They have often replaced Recurrent Neural Networks for use cases like machine translation, text summarization, and document classification. For organizations, it can be challenging to deploy transformer models in production and perform inference, because inference can be expensive and the implementation can be complex. Recently we announced the public preview of a new runtime that optimizes serving TensorFlow (TF) models on the Vertex AI Prediction service. We are happy to announce that the optimized TensorFlow runtime is now GA.


Solving For The Next Era Of Innovation And Efficiency With Data And AI - cyberpogo

#artificialintelligence

Even in today's changing business climate, our customers' needs have never been more clear: They want to reduce operating costs, boost revenue, and transform customer experiences. Today, at our third annual Google Data Cloud & AI Summit, we are announcing new product innovations and partner offerings that can optimize price-performance, help you take advantage of open ecosystems, securely set data standards, and bring the magic of AI and ML to existing data, while embracing a vibrant partner ecosystem. In the face of fast-changing market conditions, organizations need smarter systems that provide the required efficiency and flexibility to adapt. That is why today, we're excited to introduce new BigQuery pricing editions along with innovations for autoscaling and a new compressed storage billing model. BigQuery editions provide more choice and flexibility for you to select the right feature set for various workload requirements.


Vertex AI Foundations For Secure And Compliant ML/AI Deployment - cyberpogo

#artificialintelligence

An increasing number of enterprise customers are adopting ML/AI as a core transformational pillar in order to differentiate, increase revenue, reduce costs, and maximize efficiency. For many customers, ML/AI adoption can be a challenging endeavor: not only does the broad spectrum of applications ML/AI can support make it hard to decide which one to prioritize, but moving these solutions into production requires a series of security, access, and data assessments and features that some ML/AI platforms might not have. This blog post focuses on how to set up your Cloud foundations to cater specifically to the Vertex AI platform, and how to configure them to establish proper Vertex AI foundations for your future machine learning operations (MLOps) and ML/AI use cases. Explainability is not covered in this blog post, but as a practitioner you should treat it as a key component of any production-ready ML system. You can take a look at Vertex Explainable AI for a more in-depth approach to feature-based explanations, feature attribution methods (Sampled Shapley, Integrated Gradients, and XRAI), and differentiable and non-differentiable models.


Two Towers Model: A Custom Pipeline in Vertex AI Using Kubeflow

#artificialintelligence

MLOps is composed of Continuous Integration (CI -- code, unit testing, merge code), Continuous Delivery (CD -- build, test, release), and Continuous Training (CT -- train, monitor, measure, retrain, serve). Consider the following situation: you develop a solution that offers product search to users. There are new users every minute and new products every day. In this situation we will have an index of embeddings containing all the products, and user queries will be submitted as numerical vectors to this index to retrieve the best results. This index is deployed in a container inside Vertex AI endpoints.
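The product-search setup above is a two-tower retrieval step: one tower embeds the query, the other embeds products into a prebuilt index, and lookup is a nearest-neighbor search by dot product. The toy character-hash "tower" below is a placeholder for the trained encoders; product names and the vector size are invented for illustration:

```python
DIM = 8  # toy embedding size; real two-tower models use hundreds of dimensions

def embed(text: str) -> list[float]:
    # Placeholder for a trained tower: hashes words into a small fixed-size
    # vector. A real two-tower model learns separate query and product
    # encoders that map into the same vector space.
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[hash(word) % DIM] += 1.0
    return vec

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Index time: embed every product once. In production this index would be
# deployed behind a Vertex AI endpoint and refreshed as new products arrive.
PRODUCTS = ["red running shoes", "blue running shoes", "espresso machine"]
INDEX = [(name, embed(name)) for name in PRODUCTS]

def search(query: str, k: int = 2) -> list[str]:
    # Serving time: embed the query and rank products by dot product.
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: dot(q, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]
```

Because products only need embedding when they are added, the index stays cheap to refresh, which is what makes the pattern work when new products arrive every day.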


GCP Vertex AI - For Me

#artificialintelligence

Hi everyone, just to note that Machine Learning is not my expertise at all, and I do not have any particular professional experience in ML. The real reason I am interested in exploring Vertex AI is that in my mentoring sessions with the Bangkit 2021 team, one of the most frequently asked questions is about managing Machine Learning and how to deploy it, especially for students who are still pursuing their academic degrees. When I heard about Vertex AI being showcased at Google I/O, it made me really wonder how we can utilize the platform to lower the entry barrier for people to actually adopt ML. I have not yet thoroughly explored all the parameters, nor do I have the expertise to review it, but from the standpoint of a newcomer to Machine Learning, this is superb. In a nutshell, I can see that there are a lot of things we can actually do with Vertex AI, but what is important for me right now is to manage my dataset, train my models, and actually deploy them to be consumed.